Born-Digital Reboot: Workflows

Simplified Flow Diagram of Activities: Acid Bessemer Steel-Making Plant (Brian W. Hollocks Collection, 1961-1982, MC00653)

As promised in a previous post, we'll be writing about the process of overhauling our approach to dealing with born-digital archival materials as we go along this spring and summer (and maybe fall). In this post, we're going to share a little bit about our thinking on redesigning workflows!

Before we get to that, we should introduce the application we use (and are redesigning) to guide people through a born-digital processing session. As I described in the presentation and case study BitCurator: Beyond Environment:

DAEV (Digital Assets of Enduring Value) guides a processor through a session, providing explicit instructions on actions to take; records processor actions and generates files containing preservation metadata about those actions; associates a processing session with an ArchivesSpace archival object record; and creates a digital object record for a session’s archival package and associates it with the appropriate archival object record.

Here's a screen capture of DAEV walking a processor through the step of creating a TAR file from the contents of an optical disc. It shows the commands the processor runs to complete that step.

The TAR step in DAEV
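
For readers who have not packaged files this way, here is a minimal Python sketch of the general idea behind a TAR step. It is not the set of commands DAEV displays, which are not reproduced in this post; the function name `package_directory`, the `label` parameter, and the example paths are all hypothetical.

```python
import tarfile
from pathlib import Path


def package_directory(source_dir: str, tar_path: str, label: str = "disc") -> list[str]:
    """Write everything under source_dir into an uncompressed TAR file.

    Hypothetical helper for illustration; it does not reflect DAEV's code.
    tar_path should live outside source_dir so the archive does not
    try to include itself.
    """
    source = Path(source_dir)
    with tarfile.open(tar_path, "w") as archive:
        # arcname stores paths relative to a stable label instead of the
        # local mount point, so the package reads the same on any machine.
        archive.add(source, arcname=label)
    # Reopen the archive and list its members as a simple sanity check.
    with tarfile.open(tar_path, "r") as archive:
        return archive.getnames()
```

A call such as `package_directory("/media/cdrom", "/tmp/session.tar")` (both paths are placeholders) would return the member names stored in the archive, which can be compared against the disc's directory listing.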

The expectation is that a processor will work on born-digital materials with DAEV's help, while DAEV also handles some essential tracking and automated metadata creation. Processing without DAEV rarely happens. DAEV is a thing of wonder, but the workflows written into it are old and don't give the processor much flexibility. So updating our workflows means updating DAEV as well.

But what's wrong with our current workflows? Not that much, really. Our approach is, generally, to "package" files and then to assess content using several reporting tools. This works just fine in most cases, and especially "simple" cases where the files from a single transfer (e.g., files received via Google Drive) or those from a single media object (e.g., floppy disk, optical disc) should all be kept and represented by a single descriptive record in a finding aid. However, not all cases are simple.

"Complex" cases include those where a single transfer or media object contains duplicates that we deem unnecessary to keep, or has files that should be represented by more than one descriptive record. Think of a thumb drive received from one donor containing files belonging in three different collections, spanning both archival records and manuscript materials (i.e., not official university records). Our prime example is what we refer to as the "Alexander Isley hard drive," an external drive containing digital files from Isley's various design projects. We separated this content, both at the level of the bits and intellectually, into several dozen distinct archival packages (i.e., the files and documentation of our work) and descriptive records. You can see the "arrangement" in the "Digital Portfolio" series in the guide to Isley's papers. (Side note that spoils a future post on description: I really wish the records of the digital materials had been collocated with the physical portfolio materials.) While we were able to process this item, its complexity was not captured by the workflows as structured in DAEV. We had to work "outside" of DAEV.

Our soon-to-be old approach is to capture content first (create a TAR archive, which is like a ZIP file, or a disk image, which is a bit-by-bit copy of a media item) and then run and review virus, privacy, duplicate, and other reports. Sometimes this has led to a cycle where we capture, then report, and then recapture and re-report. What we're headed toward is more like report (assess/appraise) first and then capture, though in practice it will be more nuanced. DAEV is being rewritten to have the flexibility we need for the various types of media and processing scenarios we come across.
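
To make the "report, then capture" ordering concrete, here is a rough Python sketch of one kind of report, a duplicate check, run against the source files before anything is packaged. It is an illustration only: the function names `duplicate_report` and `capture_list` are hypothetical, this is not how DAEV or our reporting tools work, and deciding which copy of a duplicate to keep is an appraisal decision rather than something to automate by sort order.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def _digests(root: str) -> list[tuple[str, str]]:
    """Return (relative path, SHA-256 digest) pairs for every file under root."""
    base = Path(root)
    pairs = []
    for path in sorted(base.rglob("*")):
        if path.is_file():
            # Reads each file fully into memory; fine for a sketch,
            # but large files would call for chunked hashing.
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            pairs.append((path.relative_to(base).as_posix(), digest))
    return pairs


def duplicate_report(root: str) -> dict[str, list[str]]:
    """Group files by digest and keep only groups with more than one file."""
    groups = defaultdict(list)
    for rel_path, digest in _digests(root):
        groups[digest].append(rel_path)
    return {d: paths for d, paths in groups.items() if len(paths) > 1}


def capture_list(root: str) -> list[str]:
    """List one path per unique file content (the first in sorted order).

    Assumption for illustration: a processor would review the report
    and choose which copies to keep, rather than take the first.
    """
    seen = set()
    keep = []
    for rel_path, digest in _digests(root):
        if digest not in seen:
            seen.add(digest)
            keep.append(rel_path)
    return keep
```

With the report in hand, only the files on the capture list would go into the archival package, which avoids the capture, report, recapture loop described above.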

More on DAEV in a future post!

To learn more about our collections, explore our Rare and Unique Digital Collections and our online collection guides. If you have any questions or are interested in accessing Special Collections materials, please contact us at library_specialcollections@ncsu.edu or submit a request online.